In both terrestrial and marine ecology, physical tagging is a frequently used method for studying population dynamics and behavior. However, such tagging techniques are increasingly being replaced by individual re-identification using image analysis. This paper introduces a contrastive learning-based model for identifying individuals. The model uses the first parts of the Inception v3 network, supported by a projection head, and we use contrastive learning to find similar or dissimilar image pairs from a collection of uniform photographs. We apply this technique to corkwing wrasse, Symphodus melops, an ecologically and commercially important fish species. Photos are taken during repeated catches of the same individuals from a wild population, where the intervals between individual sightings range from a few days to several years. On our dataset, the model achieves a 1-shot accuracy of 0.35, a 5-shot accuracy of 0.56, and a 100-shot accuracy of 0.88.
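The abstract does not say which contrastive objective is used, so the following is only a minimal sketch of the general idea, assuming the classic margin-based pairwise contrastive loss on embedding vectors. The function names, the `margin` parameter, and the gallery-ranking helper are hypothetical and are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pair_contrastive_loss(emb_a, emb_b, same_individual, margin=1.0):
    """Margin-based contrastive loss for one image pair (assumed form).

    Pairs showing the same individual are pulled together (squared distance);
    pairs showing different individuals are pushed apart until they are at
    least `margin` away from each other.
    """
    d = euclidean(emb_a, emb_b)
    if same_individual:
        return d ** 2
    return max(0.0, margin - d) ** 2


def rank_gallery(query, gallery):
    """Return gallery identities sorted from nearest to farthest embedding."""
    return [
        name
        for name, _ in sorted(
            gallery.items(), key=lambda item: euclidean(query, item[1])
        )
    ]
```

With embeddings trained this way, re-identification reduces to a nearest-neighbour lookup: `rank_gallery(query_embedding, known_individuals)` lists candidate identities in order of similarity.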
Recommendation Systems (RSs) are ubiquitous in modern society and are one of the largest points of interaction between humans and AI. Modern RSs are often implemented using deep learning models, which are infamously difficult to interpret. This problem is particularly exacerbated in recommendation scenarios, as it erodes the user's trust in the RS. In contrast, the newly introduced Tsetlin Machines (TMs) possess some valuable properties due to their inherent interpretability. TMs are still fairly young as a technology. Because no TM-based RS has been developed before, preliminary research on the practicality of such a system is needed. In this paper, we develop the first RS based on TMs to evaluate its practicality in this application domain. This paper compares the viability of TMs with other machine learning models prevalent in the field of RSs. We train the TM and investigate its performance compared with a vanilla feed-forward deep learning model. These comparisons are based on model performance, interpretability/explainability, and scalability. Further, we provide some benchmark performance comparisons to similar machine learning solutions relevant to RSs.
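Transformers are neural network models that use multiple layers of self-attention heads. Attention is implemented in Transformers as contextual embeddings of the "key" and the "query". Transformers allow attention information from different layers to be recombined and process all inputs at once, which makes them more convenient than recurrent neural networks when handling large amounts of data. In recent years, Transformers have performed very well on natural language processing tasks. Meanwhile, great efforts have been made to adapt Transformers to other fields of machine learning, for example Swin Transformer and Decision Transformer. Swin Transformer is a promising neural network architecture that splits image pixels into small patches and applies local self-attention operations inside (shifted) windows of fixed size. Decision Transformer has successfully applied Transformers to offline reinforcement learning and has shown that random-walk samples from Atari games are sufficient for an agent to learn optimized behavior. However, combining online reinforcement learning with Transformers is considerably more challenging. In this paper, we further explore the possibility of leaving the reinforcement learning policy unmodified and only replacing the convolutional neural network architecture with the self-attention architecture of Swin Transformer. That is, we aim to change how an agent views the world, not how it plans about the world. We conduct experiments on 49 games in the Arcade Learning Environment. The results show that using Swin Transformer in reinforcement learning achieves significantly higher evaluation scores in the Arcade Learning Environment. We therefore conclude that online reinforcement learning can benefit from exploiting self-attention with spatial token embeddings.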
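Q-learning is one of the best-known reinforcement learning algorithms. Great efforts have been made to develop this algorithm using neural networks, and the Bootstrapped Deep Q-Learning Network is among them. It uses multiple neural network heads to introduce diversity into Q-learning. Diversity can sometimes be viewed as the number of reasonable moves an agent can take in a given state, analogous to the definition of the exploration ratio in RL. The performance of the Bootstrapped Deep Q-Learning Network is therefore deeply connected to the level of diversity within the algorithm. The original research pointed out that a random prior can improve the performance of the model. In this paper, we further explore the possibility of replacing the priors with noise, sampling the noise from a Gaussian distribution to introduce more diversity into the algorithm. We conduct experiments on the Atari benchmark and compare our algorithm with the original algorithm and other related algorithms. The results show that our modification of the Bootstrapped Deep Q-Learning algorithm achieves higher evaluation scores across different types of Atari games. We therefore conclude that replacing priors with noise can improve the performance of Bootstrapped Deep Q-Learning by ensuring the integrity of diversity.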
Networks have become indispensable and ubiquitous structures in many fields for modeling the interactions among different entities, such as friendship in social networks or protein interactions in biological graphs. A major challenge is to understand the structure and dynamics of these systems. Although networks evolve through time, most existing graph representation learning methods target only static networks. Although approaches have been developed for modeling dynamic networks, there is a lack of efficient continuous-time dynamic graph representation learning methods that can provide accurate network characterization and visualization in low dimensions while explicitly accounting for prominent network characteristics such as homophily and transitivity. In this paper, we propose the Piecewise-Velocity Model (PiVeM) for the representation of continuous-time dynamic networks. It learns dynamic embeddings in which the temporal evolution of nodes is approximated by piecewise linear interpolations based on a latent distance model with piecewise constant node-specific velocities. The model allows for analytically tractable expressions of the associated Poisson process likelihood, with scalable inference invariant to the number of events. We further impose a scalable Kronecker-structured Gaussian Process prior on the dynamics, accounting for community structure, temporal smoothness, and disentangled (uncorrelated) latent embedding dimensions optimally learned to characterize the network dynamics. We show that PiVeM can successfully represent network structure and dynamics in ultra-low two-dimensional spaces. It outperforms relevant state-of-the-art methods in downstream tasks such as link prediction. In summary, PiVeM enables easily interpretable dynamic network visualizations and characterizations that can further improve our understanding of the intrinsic dynamics of time-evolving networks.
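To make the piecewise-velocity idea concrete, the sketch below computes a node's latent position from a starting point and one constant velocity per time bin, then evaluates a latent-distance Poisson intensity for a node pair. The equal-width bins, the squared Euclidean distance, the node-specific bias terms `beta_i` and `beta_j`, and all function names are assumptions made for illustration; the abstract does not give the exact parameterization.

```python
import math


def position(x0, velocities, bin_width, t):
    """Latent position at time t under piecewise constant velocities.

    `velocities[b]` is the constant velocity vector used during time bin
    [b * bin_width, (b + 1) * bin_width). The trajectory is therefore
    piecewise linear. Times past the last bin stay at the final position.
    """
    pos = list(x0)
    for b, v in enumerate(velocities):
        start = b * bin_width
        if t <= start:
            break
        # Time actually spent inside this bin, capped at the bin width.
        dt = min(t, start + bin_width) - start
        pos = [p + vi * dt for p, vi in zip(pos, v)]
    return pos


def intensity(beta_i, beta_j, pos_i, pos_j):
    """Assumed latent-distance Poisson intensity: closer nodes interact more."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(pos_i, pos_j))
    return math.exp(beta_i + beta_j - sq_dist)
```

For example, a node starting at the origin with velocity `[1.0, 0.0]` in the first unit-width bin and `[0.0, 2.0]` in the second is at `[1.0, 1.0]` at time 1.5.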
GAN vocoders are currently one of the state-of-the-art methods for building high-quality neural waveform generative models. However, most of their architectures require tens of billions of floating-point operations per second (GFLOPS) to generate speech waveforms in a samplewise manner. This makes GAN vocoders still challenging to run on normal CPUs without accelerators or parallel computers. In this work, we propose a new architecture for GAN vocoders that mainly depends on recurrent and fully-connected networks to directly generate the time-domain signal in a framewise manner. This results in a considerable reduction of the computational cost and enables very fast generation on both GPUs and low-complexity CPUs. Experimental results show that our Framewise WaveGAN vocoder achieves significantly higher quality than auto-regressive maximum-likelihood vocoders such as LPCNet at a very low complexity of 1.2 GFLOPS. This makes GAN vocoders more practical on edge and low-power devices.
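Clinical language processing has received a lot of attention in recent years, resulting in new models or methods for disease phenotyping, mortality prediction, and other tasks. Unfortunately, many of these approaches are tested under different experimental settings (e.g., data sources, training and testing splits, metrics, evaluation criteria), making it difficult to compare approaches and determine the state of the art. To address these issues and facilitate reproducibility and comparison, we present the Clinical Language Understanding Evaluation (CLUE) benchmark, with a set of four clinical language understanding tasks; standard training, development, validation, and testing sets derived from MIMIC data; and a software toolkit. We hope that these data will enable direct comparison between approaches, improve reproducibility, and lower the barrier to entry for developing novel models or methods for these clinical language understanding tasks.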
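We propose and demonstrate a new approach for fast and accurate surrogate modeling of urban drainage system hydraulics based on physics-guided machine learning. The surrogates are trained against a limited set of simulation results from a hydrodynamic (HiFi) model. Compared with a HiFi model, our approach reduces simulation times by one to two orders of magnitude. It is thus slower than, for example, conceptual hydrological models, but it can simulate water levels, flows, and surcharges in all nodes and links of a drainage network and therefore largely preserves the level of detail provided by HiFi models. Comparing the time series simulated by the surrogate and by the HiFi model, R2 values on the order of 0.9 are achieved. Surrogate training times are currently on the order of one hour. However, they can be reduced by applying transfer learning and graph neural networks. Our surrogate approach will be useful for interactive workshops in the initial design phase of urban drainage systems, as well as for real-time applications. In addition, our model formulation is generic, and future research should investigate its application to simulating other water systems.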
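The abstract does not state which R2 definition is used. As an illustration, the sketch below assumes the standard coefficient of determination, with the HiFi time series as the reference and the surrogate time series as the prediction. The function name and argument names are hypothetical.

```python
def r_squared(reference, simulated):
    """Coefficient of determination of `simulated` against `reference`.

    Computed as 1 - SS_res / SS_tot, where SS_res is the sum of squared
    differences between the two series and SS_tot is the sum of squared
    deviations of the reference from its own mean. The reference series
    must not be constant, otherwise SS_tot is zero.
    """
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - s) ** 2 for r, s in zip(reference, simulated))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot
```

A surrogate that reproduces the reference exactly scores 1.0, while one that only predicts the reference mean scores 0.0.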
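Recent work on image anonymization has shown that generative adversarial networks (GANs) can generate near-photorealistic faces to anonymize individuals. However, scaling these networks up to the entire human body remains a challenging and unsolved task. We propose a new anonymization method that generates near-photorealistic humans for in-the-wild images. A key part of our design is to guide the adversarial networks by dense pixel-to-surface correspondences between an image and a canonical 3D surface. We introduce Variational Surface-Adaptive Modulation (V-SAM), which embeds surface information throughout the generator. Combined with our novel discriminator surface supervision loss, the generator can synthesize high-quality humans with diverse appearances in complex and varying scenes. We show that this surface guidance significantly improves the image quality and diversity of samples, yielding a highly practical generator. Finally, we demonstrate that surface-guided anonymization preserves the usability of data for future computer vision development.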
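As a general trend in industrial robotics, an increasing number of safety functions are being developed or re-engineered to be handled in software rather than by physical hardware such as safety relays or interlock circuits. This trend reinforces the importance of supplementing the traditional, input-based testing and quality procedures that are widely used in industry today with formal verification and model-checking methods. To this end, this paper focuses on a representative safety-critical system in an ABB industrial paint robot, namely the high-voltage electrostatic control system (HVC). The practical convergence of the high voltage produced by the HVC, which is essential for safe operation, is formally verified using a novel and general co-verification framework in which hardware and software models are related via platform mappings. This approach enables a pragmatic combination of highly diverse and specialized tools. The paper's main contributions include details on how hardware abstractions and verification results can be transferred between tools in order to verify system-level safety properties. Notably, the HVC application considered in this paper has the rather generic form of a feedback controller. The co-verification framework and the experiences reported here are therefore also highly relevant to any cyber-physical system that tracks a setpoint reference.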